NumPy arrays really excel at handling multi-dimensional data.
If you pass `np.array` a list of lists (all of equal length), it creates a two-dimensional array:
import numpy as np
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid)
You can ask any array how many dimensions it has:
grid.ndim
Or get more detail on exactly how large it is in each dimension with:
grid.shape
So it has two dimensions, one of size 2 and one of size 3 (i.e. it's $2\times3$).
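As a quick check of how these attributes relate, the sketch below rebuilds the same grid and inspects it. The `size` attribute is not discussed above; it is included only to show that the total element count is the product of the entries in the shape.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

print(grid.ndim)   # number of axes: 2
print(grid.shape)  # length along each axis: (2, 3)
print(grid.size)   # total number of elements: 6 (2 * 3)

# The number of dimensions is always the length of the shape tuple
print(len(grid.shape) == grid.ndim)  # True
```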
If you access a multi-dimensional array using a single number in the square brackets then it will give you the data by row. For example, to pull out the first row:
grid[0]
| | | | |
|---|---|---|---|
| 0 | 1 | 2 | 3 |
| | 4 | 5 | 6 |
or the second row:
grid[1]
| | | | |
|---|---|---|---|
| | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 |
Then you can get individual elements by specifying multiple indices inside the square brackets. For example, to get the first row and the second column:
grid[0, 1]
| | | 1 | |
|---|---|---|---|
| 0 | 1 | 2 | 3 |
| | 4 | 5 | 6 |
This is useful as it also allows you to select the data by column, using `:` to mean "all":
grid[:, 1]
| | | 1 | |
|---|---|---|---|
| : | 1 | 2 | 3 |
| : | 4 | 5 | 6 |
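The three kinds of selection described above can be compared side by side. This is a minimal sketch using the same `grid`; the variable names are chosen here for illustration only.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

first_row = grid[0]         # a single index selects along axis 0 (rows)
element = grid[0, 1]        # row 0, column 1
second_column = grid[:, 1]  # every row, column 1

print(first_row)      # [1 2 3]
print(element)        # 2
print(second_column)  # [2 5]

# Selecting a whole row or column drops one dimension
print(first_row.shape, second_column.shape)  # (3,) (2,)
```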
So the order in which you put indices inside the `[]` matches the way the array is printed: the first index selects the "row" and the second selects the "column". In NumPy-speak these are called axes and are numbered in order from 0. In the example of our grid, they are:
| | axis 1 → | | |
|---|---|---|---|
| axis 0 ↓ | 1 | 2 | 3 |
| | 4 | 5 | 6 |
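Axis numbers matter beyond indexing, because many NumPy functions accept an `axis` argument that says which axis to collapse. The sketch below uses the `sum` method, which is not covered above, purely to illustrate how the two axes of `grid` behave.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# Summing along axis 0 collapses the rows, leaving one value per column
col_totals = grid.sum(axis=0)
# Summing along axis 1 collapses the columns, leaving one value per row
row_totals = grid.sum(axis=1)

print(col_totals)  # [5 7 9]
print(row_totals)  # [ 6 15]
```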
NumPy arrays can handle any number of dimensions up to a fixed maximum (32 in NumPy 1.x, raised to 64 in NumPy 2.0). So you can make a three-dimensional cube of numbers with:
cube = np.array([[[1,2], [3, 4]], [[5, 6], [7, 8]]])
print(cube)
You can index the values in it just as before, separating each axis index with a comma:
cube[0, 0, 0]
Notice how the array is represented when it's printed. Each "row" of the cube is now a 2-D sub-array, separated by a blank line.
In labelled diagram form, axis 0 steps from one 2-D sub-array to the next, axis 1 steps down the rows within a sub-array and axis 2 steps along each row.
Notice that the numbers along a row always lie along the last axis (they were axis 1 before, now they're axis 2).
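To see how each extra index peels off one axis, here is a small sketch that works inwards through the same `cube`, one index at a time.

```python
import numpy as np

cube = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(cube.shape)     # (2, 2, 2)
print(cube[1])        # second 2-D sub-array (rows [5 6] and [7 8])
print(cube[1, 0])     # first row of that sub-array: [5 6]
print(cube[1, 0, 1])  # a single element: 6

# Each index used removes one dimension from the result
print(cube[1].ndim, cube[1, 0].ndim)  # 2 1
```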
Try indexing `cube` to select each of the following:
8
[1 2]
[2 6]
[[3 4]
 [7 8]]
Just as with one-dimensional arrays, we can create multi-dimensional arrays with built-in functions such as `np.zeros`:
np.zeros(shape=(2, 7))
and `np.ones`:
np.ones(shape=(3, 5))
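It can help to confirm what these functions actually return. The sketch below checks the shapes and the default element type; the `dtype` argument is not covered above and appears only to show that the element type can be changed.

```python
import numpy as np

zeros = np.zeros(shape=(2, 7))
ones = np.ones(shape=(3, 5))

print(zeros.shape, ones.shape)  # (2, 7) (3, 5)
print(zeros.dtype)              # float64 by default

# A different element type can be requested with the dtype argument
int_ones = np.ones(shape=(3, 5), dtype=int)
print(int_ones[0])     # [1 1 1 1 1]
print(int_ones.sum())  # 15
```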
and even more complex functions such as `np.fromfunction`, which takes as an argument a function which, when called with $x$ and $y$ values, returns the corresponding output value for that coördinate. So a grid of something like $\sin(x) - \cos(y) + \frac{y}{4}$ can be created with:
def trig(x, y):
    x = np.radians(x)  # Convert from degrees to radians
    y = np.radians(y)  # Convert from degrees to radians
    return np.sin(x) - np.cos(y) + y/4
trig_grid = np.fromfunction(trig, shape=(720, 720))
trig_grid
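One detail worth knowing is that `np.fromfunction` does not call the function once per element. It calls it a single time, passing whole arrays of indices: the first argument holds each element's axis 0 index and the second holds its axis 1 index. The small sketch below makes this visible; `index_code` is a hypothetical function invented for illustration.

```python
import numpy as np

def index_code(i, j):
    # i holds the axis-0 index of every element, j the axis-1 index
    return 10 * i + j

codes = np.fromfunction(index_code, shape=(2, 3))
print(codes.tolist())  # [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]

# The indices are passed as floats unless another dtype is requested
int_codes = np.fromfunction(index_code, shape=(2, 3), dtype=int)
print(int_codes.tolist())  # [[0, 1, 2], [10, 11, 12]]
```

In the `trig` example this means that `x` varies along axis 0 (down the rows) and `y` varies along axis 1 (across the columns).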
import matplotlib.pyplot as plt
fig, ax = plt.subplots() # Create the plotting area
im = ax.imshow(trig_grid) # Display the array as an image
`imshow` takes any 2D array and displays it as if it were an image, setting the colour of each pixel based on its value. One thing to be aware of is that the origin of the axes is in the top-left, rather than the bottom-left (due to conventions on how image data is stored). You can change this by passing `origin="lower"` to the function:
fig, ax = plt.subplots()
im = ax.imshow(trig_grid, origin="lower") # Added the `origin` argument
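The effect of `origin` is easier to see on a tiny array than on the full grid. The following sketch draws the same array twice, side by side; `small` is a hypothetical array made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

small = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]

fig, (ax_upper, ax_lower) = plt.subplots(ncols=2)
ax_upper.imshow(small)                  # default: row 0 drawn at the top
ax_lower.imshow(small, origin="lower")  # row 0 drawn at the bottom
ax_upper.set_title("default origin")
ax_lower.set_title('origin="lower"')

# The y axis runs downwards by default and upwards with origin="lower"
print(ax_upper.yaxis_inverted(), ax_lower.yaxis_inverted())  # True False
```

The data is identical in both panels; only the direction of the vertical axis changes.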
It's also useful to display a colour bar alongside the data itself, which you can do by calling `fig.colorbar` and passing it the return value of the `imshow` call:
fig, ax = plt.subplots()
im = ax.imshow(trig_grid, origin="lower")
fig.colorbar(im) # Added this line
You can change the colour scheme used to plot with the `cmap` argument:
fig, ax = plt.subplots()
im = ax.imshow(trig_grid, origin="lower", cmap="inferno") # Added the `cmap` argument
fig.colorbar(im)
There are other two-dimensional (and even three-dimensional) plotting types available, such as the `contour` plot:
fig, ax = plt.subplots()
im = ax.contour(trig_grid, cmap="inferno") # Changed from `imshow` to `contour`
fig.colorbar(im)
From the same file that we downloaded before, grab the `"temperature"` array. This data is based on ECMWF data and is in units of kelvin. It is three-dimensional, with axes of altitude, latitude and longitude. The altitude axis is layered such that the 0th layer is ground level and each layer beyond that increases in altitude.
You can load in the data with:
with np.load("weather_data.npz") as weather:
    temperature = weather["temperature"]
Since this is three-dimensional data, it's worth exploring its shape first. Once you understand the data, you can pass the `axis=0` argument to `np.mean` to have it average along one axis only.
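Here is a minimal sketch of how averaging along one axis behaves. It does not use the real file: `fake_temperature` is a hypothetical stand-in filled with random values, and its shape of 3 altitude layers, 4 latitudes and 5 longitudes is an assumption made for illustration, so check the shape of the real array before relying on any of these numbers.

```python
import numpy as np

# Hypothetical stand-in for the real data: 3 altitude layers,
# 4 latitudes and 5 longitudes, with values in kelvin
rng = np.random.default_rng(seed=0)
fake_temperature = 250 + 40 * rng.random(size=(3, 4, 5))

print(fake_temperature.shape)  # (3, 4, 5)

ground_level = fake_temperature[0]  # the 0th altitude layer, shape (4, 5)

# Averaging along axis 0 collapses the altitude axis and leaves
# one mean value per latitude/longitude point
mean_over_altitude = np.mean(fake_temperature, axis=0)
print(mean_over_altitude.shape)  # (4, 5)

# Without the axis argument, np.mean averages every value into one number
overall_mean = np.mean(fake_temperature)
print(overall_mean.ndim)  # 0
```

Both `ground_level` and `mean_over_altitude` are two-dimensional, so either could be passed to `imshow` in the same way as `trig_grid`.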